
    Social and Transmission Contact Network Analysis of Epidemic Dynamics in Agent-Based Models

    This thesis aims to perform social network analysis on the synthetic population used in the FRED system and to develop transmission network analysis tools for studying epidemic dynamics in agent-based models of infectious disease. The social network of the synthetic Allegheny County population consists of 1.2M agents. The synthetic population is shown to be a well-integrated network, with an average shortest path length of 6.91. The risk of infection for each age group is positively related to the group's average degree. Although the degree distribution has a bifurcating pattern, it is still reasonable and conservative to use the synthetic population for modeling disease transmission. Tools were developed to analyze the transmission networks generated by FRED simulations. Three tools, TraceAnalysis, StatisticalAnalysis, and EpidemicDynamicPlot, calculate statistics of transmission networks, draw inferences on statistics from different simulation scenarios, and plot epidemic curves. The tools are used to analyze the effectiveness of public policies: school closure and vaccination policies from FRED were compared against a baseline FRED run with no intervention policy. The results from the network analysis tools indicate a dependency among agents' infection locations. Distinctive transmission patterns can be found by comparing the observed number of transmission patterns against the expected number. The public health value of the network analysis tools is to identify contact-tracing motifs, to reveal the strong dependency among the locations where infection events happen, and to compare the effectiveness of different public intervention policies in agent-based models.
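    As a rough illustration of the kind of analysis the thesis describes, the sketch below computes the average shortest path length and the per-age-group average degree of a contact network with networkx. The edge-list file name and the age-group mapping are hypothetical placeholders, not the actual FRED data format.

```python
# A minimal sketch, assuming a plain "agent_u agent_v" edge list and a
# separate agent -> age-group mapping; neither reflects FRED's real format.
import networkx as nx
from collections import defaultdict

G = nx.read_edgelist("contacts.edgelist")   # hypothetical input file
age_group = {n: "adult" for n in G}         # placeholder mapping

# Average shortest path length on the largest connected component (the
# thesis reports 6.91 for the 1.2M-agent population; at that scale an
# exact computation is expensive and sampling-based estimates are typical).
cc = G.subgraph(max(nx.connected_components(G), key=len))
print("avg shortest path:", nx.average_shortest_path_length(cc))

# Average degree per age group, to relate to each group's infection risk.
deg = defaultdict(list)
for n in G:
    deg[age_group[n]].append(G.degree(n))
for g, ds in deg.items():
    print(g, sum(ds) / len(ds))
```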

    Robust Knowledge Transfer in Tiered Reinforcement Learning

    In this paper, we study the Tiered Reinforcement Learning setting, a parallel transfer learning framework in which the goal is to transfer knowledge from the low-tier (source) task to the high-tier (target) task, reducing the exploration risk of the latter while solving the two tasks in parallel. Unlike previous work, we do not assume the low-tier and high-tier tasks share the same dynamics or reward functions, and we focus on robust knowledge transfer without prior knowledge of the task similarity. We identify a natural and necessary condition for our objective, called "Optimal Value Dominance". Under this condition, we propose novel online learning algorithms that, for the high-tier task, achieve constant regret on partial states depending on the task similarity and retain near-optimal regret when the two tasks are dissimilar, while keeping the low-tier task near-optimal without sacrifice. Moreover, we further study the setting with multiple low-tier tasks and propose a novel transfer source selection mechanism, which ensembles the information from all low-tier tasks and allows provable benefits on a much larger state-action space.
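    The abstract does not spell out the algorithm, but the transfer decision can be pictured as follows: reuse the low-tier policy only on states where a dominance-style confidence test passes, and explore optimistically elsewhere. The sketch below is an illustrative guess, not the paper's method; all names (V_low, pi_low, the form of the bonus) are hypothetical.

```python
# Illustrative sketch only, NOT the paper's algorithm: transfer from the
# low-tier task when a dominance-style confidence test holds.
import numpy as np

def choose_action(s, Q_high, V_low, pi_low, counts, t):
    # Optimistic confidence width for state s (hypothetical form).
    bonus = np.sqrt(np.log(max(t, 2)) / max(counts[s], 1))
    # "Optimal value dominance"-style check: the low-tier value estimate
    # plausibly dominates the high-tier optimal value at this state.
    if V_low[s] >= Q_high[s].max() - bonus:
        return pi_low[s]                      # transfer: reuse low-tier policy
    return int(np.argmax(Q_high[s] + bonus))  # otherwise explore optimistically
```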

    Is Simple Uniform Sampling Efficient for Center-Based Clustering With Outliers: When and Why?

    Clustering has many important applications in computer science, but real-world datasets often contain outliers, and the presence of outliers can make clustering problems much more challenging. In this paper, we propose a framework for solving three representative center-based clustering with outliers problems: k-center/median/means clustering with outliers. The framework is actually very simple: we just take a small uniform sample from the input and run an existing approximation algorithm on the sample. However, our analysis is fundamentally different from previous (uniform and non-uniform) sampling-based ideas. To explain the effectiveness of uniform sampling in theory, we introduce a "significance" criterion and prove that the performance of our framework depends on the significance degree of the given instance. In particular, the sample size can be independent of the input data size n and the dimensionality d if we assume the given instance is sufficiently "significant", which is in fact a fairly appropriate assumption in practice. Due to its simplicity, the uniform sampling approach also enjoys several significant advantages over non-uniform sampling approaches. The experiments suggest that our framework achieves clustering results comparable with existing methods while being much easier to implement and greatly reducing running times. To the best of our knowledge, this is the first work that systematically studies the effectiveness of uniform sampling from both theoretical and experimental aspects.
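    The framework itself is simple enough to sketch: draw a small uniform sample, run any off-the-shelf approximation algorithm on it, and evaluate on the full data while excluding the z farthest points as outliers. The sketch below uses scikit-learn's KMeans as the stand-in algorithm; the paper's framework is agnostic to this choice.

```python
# A minimal sketch of the uniform-sampling framework, assuming k-means as
# the existing approximation algorithm (any such algorithm would do).
import numpy as np
from sklearn.cluster import KMeans

def cluster_with_outliers(X, k, z, sample_size, seed=0):
    rng = np.random.default_rng(seed)
    sample = X[rng.choice(len(X), size=sample_size, replace=False)]
    centers = KMeans(n_clusters=k, n_init=10,
                     random_state=seed).fit(sample).cluster_centers_
    # Distance from every input point to its nearest center.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2).min(axis=1)
    inliers = np.argsort(d)[: len(X) - z]     # discard the z farthest points
    return centers, (d[inliers] ** 2).sum()   # k-means cost on the inliers
```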

    Coresets for Wasserstein Distributionally Robust Optimization Problems

    Wasserstein distributionally robust optimization (WDRO) is a popular model for enhancing the robustness of machine learning with ambiguous data. However, the complexity of WDRO can be prohibitive in practice, since solving its "minimax" formulation requires a great amount of computation. Recently, several fast WDRO training algorithms for some specific machine learning tasks (e.g., logistic regression) have been developed. However, to the best of our knowledge, research on designing efficient algorithms for general large-scale WDRO problems is still quite limited. The coreset is an important tool for compressing large datasets, and it has been widely applied to reduce the computational complexity of many optimization problems. In this paper, we introduce a unified framework to construct an ε-coreset for general WDRO problems. Though it is challenging to obtain a conventional coreset for WDRO due to the uncertainty of the ambiguous data, we show that a "dual coreset" can be computed by using the strong duality property of WDRO. Moreover, the error introduced by the dual coreset can be theoretically bounded with respect to the original WDRO objective. To construct the dual coreset, we propose a novel grid sampling approach that is particularly suitable for the dual formulation of WDRO. Finally, we implement our coreset approach and illustrate its effectiveness for several WDRO problems in the experiments.
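    The paper's dual coreset operates on the dual formulation of WDRO, which the abstract does not detail; the sketch below only illustrates the generic grid-sampling flavor of coreset compression: snap points to grid cells and keep one weighted representative per occupied cell. The cell width h is a hypothetical accuracy parameter.

```python
# A rough, generic grid-compression sketch; the paper's dual coreset for
# WDRO is more involved and is built on the dual formulation.
import numpy as np

def grid_coreset(X, h):
    cells = np.floor(X / h).astype(int)       # grid cell index of each point
    keys, inv, counts = np.unique(cells, axis=0,
                                  return_inverse=True, return_counts=True)
    # One representative per occupied cell: the mean of its points,
    # weighted by how many points fell into that cell.
    reps = np.zeros((len(keys), X.shape[1]))
    np.add.at(reps, inv, X)
    reps /= counts[:, None]
    return reps, counts.astype(float)         # representatives and weights
```

    A weighted objective evaluated on the pair (reps, weights) would then approximate the full-data objective, with the error controlled by the cell width h.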